Random Forests for Spatially Dependent Data
Abstract
Spatial linear mixed-models, consisting of a linear covariate effect and a Gaussian process (GP) distributed spatial random effect, are widely used for analyses of geospatial data. We consider the setting where the covariate effect is nonlinear. Random forests (RF) are popular for estimating nonlinear functions, but applications of RF to spatial data have often ignored the spatial correlation. We show that this impacts the performance of RF adversely. We propose RF-GLS, a novel and well-principled extension of RF, for estimating nonlinear covariate effects in spatial mixed models where the spatial correlation is modeled using GP. RF-GLS extends RF in the same way generalized least squares (GLS) fundamentally extends ordinary least squares (OLS) to accommodate dependence in linear models. RF becomes a special case of RF-GLS, and is substantially outperformed by RF-GLS for both estimation and prediction across extensive numerical experiments with spatially correlated data. RF-GLS can be used for functional estimation in other types of dependent data like time series. We prove consistency of RF-GLS for β-mixing dependent error processes that include the popular spatial Matérn GP. As a byproduct, we also establish, to our knowledge, the first consistency result for RF under dependence. We establish results of independent importance, including a general consistency result of GLS optimizers of data-driven function classes, and a uniform law of large numbers under β-mixing dependence with weaker assumptions. These new tools can be potentially useful for asymptotic analysis of other GLS-style estimators in nonparametric regression with dependent data.
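The abstract describes RF-GLS as extending RF in the same way GLS extends OLS. The sketch below illustrates only that linear-model relationship, not RF-GLS itself: GLS is OLS applied to data whitened by a Cholesky factor of the error covariance, and it reduces to OLS when the covariance is the identity. The function names, the exponential (Matérn with smoothness 1/2) covariance, and all parameter values are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

def exponential_cov(coords, sigma2=1.0, phi=3.0, tau2=0.1):
    """Exponential covariance plus a nugget (illustrative parameter values)."""
    # Pairwise Euclidean distances between all locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d) + tau2 * np.eye(len(coords))

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def gls(X, y, Sigma):
    """Generalized least squares: whiten with the Cholesky factor, then run OLS."""
    L = np.linalg.cholesky(Sigma)   # Sigma = L @ L.T
    Xw = np.linalg.solve(L, X)      # L^{-1} X
    yw = np.linalg.solve(L, y)      # L^{-1} y
    return ols(Xw, yw)

# Simulated spatial data with a linear covariate effect (hypothetical setup).
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(size=(n, 2))
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
Sigma = exponential_cov(coords)
# Spatially correlated errors drawn with covariance Sigma.
y = X @ beta_true + np.linalg.cholesky(Sigma) @ rng.normal(size=n)

beta_ols = ols(X, y)
beta_gls = gls(X, y, Sigma)
# With an identity covariance, GLS coincides with OLS.
beta_gls_identity = gls(X, y, np.eye(n))
```

In this sketch the covariance is treated as known; in practice its parameters would have to be estimated from the data.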
Similar resources
Spatially Coherent Random Forests
Spatially Coherent Random Forest (SCRF) extends Random Forest to create spatially coherent labeling. Each split function in SCRF is evaluated based on a traditional information gain measure that is regularized by a spatial coherency term. This way, SCRF is encouraged to choose split functions that cluster pixels both in appearance space and in image space. In particular, we use SCRF to detect c...
Bayesian Melding of Deterministic Models and Kriging for Analysis of Spatially Dependent Data
Spatial data melding methods owe their invention and development to the link between geographic information systems and decision-making approaches. These methods combine different data sets to achieve better results. In this paper, the Bayesian melding method for combining measurements with the outputs of deterministic models and kriging is considered. Then the ozone data in Tehran city are analyze...
Random Forests for Big Data
Big Data is one of the major challenges of statistical science and has numerous consequences from algorithmic and theoretical viewpoints. Big Data always involve massive data but they also often include data streams and data heterogeneity. Recently some statistical methods have been adapted to process Big Data, like linear regression models, clustering methods and bootstrapping schemes. Based o...
Random survival forests for high-dimensional data
Minimal depth is a dimensionless order statistic that measures the predictiveness of a variable in a survival tree. It can be used to select variables in high-dimensional problems using Random Survival Forests (RSF), a new extension of Breiman’s Random Forests (RF) to survival settings. We review this methodology and demonstrate its use in high-dimensional survival problems using a public domai...
Context-dependent feature analysis with random forests
In many cases, feature selection is often more complicated than identifying a single subset of input variables that would together explain the output. There may be interactions that depend on contextual information, i.e., variables that reveal to be relevant only in some specific circumstances. In this setting, the contribution of this paper is to extend the random forest variable importances f...
Journal
Journal title: Journal of the American Statistical Association
Year: 2021
ISSN: 0162-1459, 1537-274X, 2326-6228, 1522-5445
DOI: https://doi.org/10.1080/01621459.2021.1950003